Data Loading and Cleaning

  User_ID       App Daily_Minutes_Spent Posts_Per_Day Likes_Per_Day Follows_Per_Day Engagement
1     U_1 Pinterest                 288            16            94               0        110
2     U_2  Facebook                 192            14           117              15        146
3     U_3 Instagram                 351            13           120              48        181
4     U_4    TikTok                  21            20           117               8        145
5     U_5  LinkedIn                 241            16             9              21         46
6     U_6   Twitter                 464             3           137              30        170

---
title: "Social Media Usage Analysis"
author: "Scot Swanson"
output:
  flexdashboard::flex_dashboard:
    orientation: rows
    vertical_layout: fill
    theme:
      bootswatch: zephyr
    source_code: embed
---

```{r setup, include=FALSE}
library(flexdashboard)
library(tidyverse)  # attaches dplyr and ggplot2, so ggplot2 need not be loaded separately
library(plotly)
library(DT)
```

# Title Page

## Column {.tabset}

### Project Title
**Social Media Usage Analysis**

### Team Members
Scot Swanson

# Introduction

## Row {.tabset}

### Project Overview

This project investigates social media usage metrics, focusing on engagement patterns across various platforms. By analyzing metrics such as daily time spent, posts, likes, and follows, we aim to gain insights into user behavior and engagement trends across platforms like Instagram, Facebook, and Twitter.

### Research Background and Significance

Social media has become an integral part of daily life, influencing communication, culture, and commerce. Analyzing social media engagement helps companies and researchers understand user behavior, optimize platform content, and potentially improve user experience. This study is significant because it sheds light on the factors contributing to higher engagement, which can benefit marketers, advertisers, and platform developers.

### Research Questions

This analysis aims to address the following research questions:

- Which social media platform has the highest average daily usage among users?
- What is the relationship between likes and follows across different platforms?
- How does time spent correlate with follows per day?

### Data Source and Collection

The dataset used for this project is publicly available on Kaggle: [Social Media Usage Dataset](https://www.kaggle.com/datasets/bhadramohit/social-media-usage-datasetapplications). Each row records a user (`User_ID`), a platform (`App`, for example Pinterest, Facebook, Instagram, TikTok, LinkedIn, or Twitter), and four daily metrics: `Daily_Minutes_Spent`, `Posts_Per_Day`, `Likes_Per_Day`, and `Follows_Per_Day`. These metrics are the basis for answering our research questions.

# Data Loading and Cleaning

```{r}
# Load the social media dataset
data <- read.csv("social_media_usage.csv")

# Data Cleaning Steps:
# 1. Remove any rows with missing values for cleaner analysis
data <- na.omit(data)

# 2. Feature Engineering: Create an "Engagement" variable to represent daily user activity as the sum of posts, likes, and follows
data <- data %>%
  mutate(Engagement = Posts_Per_Day + Likes_Per_Day + Follows_Per_Day)

# Display the first few rows of the cleaned data for verification
head(data)
```

### Detailed Data Cleaning and Manipulation Process

To prepare the data for analysis, we followed these data cleaning steps:

- **Loaded the Data**: Read the CSV file with `read.csv()`.
- **Removed Missing Values**: Dropped every row containing a missing value with `na.omit()`, so that all later summaries use complete records.
- **Feature Engineering**: Created a new "Engagement" variable by summing the daily counts of posts, likes, and follows. This provides an overall metric for user activity.
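
The steps above can be sketched on a small toy data frame. This is a minimal illustration in base R rather than the dashboard's own chunk: the first six rows follow the dataset's column layout, and the seventh row (`U_X`, with a missing `Daily_Minutes_Spent`) is a hypothetical addition so that `na.omit()` has something to drop.

```r
# Toy data in the dataset's column layout; row "U_X" is hypothetical and
# exists only to demonstrate missing-value removal.
toy <- data.frame(
  User_ID = c("U_1", "U_2", "U_3", "U_4", "U_5", "U_6", "U_X"),
  App = c("Pinterest", "Facebook", "Instagram", "TikTok",
          "LinkedIn", "Twitter", "Facebook"),
  Daily_Minutes_Spent = c(288, 192, 351, 21, 241, 464, NA),
  Posts_Per_Day = c(16, 14, 13, 20, 16, 3, 5),
  Likes_Per_Day = c(94, 117, 120, 117, 9, 137, 40),
  Follows_Per_Day = c(0, 15, 48, 8, 21, 30, 2)
)

# Step 1: drop every row with at least one NA and record how many were removed
clean <- na.omit(toy)
rows_dropped <- nrow(toy) - nrow(clean)

# Step 2: engagement is the unweighted sum of the three daily counts
clean$Engagement <- clean$Posts_Per_Day + clean$Likes_Per_Day + clean$Follows_Per_Day

cat("Rows dropped:", rows_dropped, "\n")
print(clean[, c("User_ID", "App", "Engagement")])
```

Reporting `rows_dropped` next to the cleaned data makes it visible how much of the dataset the missing-value filter removes.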

# Summary Statistics

## Row {.tabset}

### Distribution of Daily Minutes Spent

```{r}
ggplot(data, aes(x=Daily_Minutes_Spent)) +
  geom_histogram(bins=20, fill="#377eb8", color="black", alpha=0.8) +
  labs(title="Distribution of Daily Minutes Spent on Social Media",
       x="Daily Minutes Spent", y="Frequency") +
  theme_minimal() +
  theme(plot.title = element_text(size=16, face="bold"),
        axis.title = element_text(size=14),
        axis.text = element_text(size=12))
```

### Average Daily Usage by Platform

```{r}
avg_usage <- data %>%
  group_by(App) %>%
  summarize(avg_daily_minutes = mean(Daily_Minutes_Spent)) %>%
  arrange(desc(avg_daily_minutes))

ggplot(avg_usage, aes(x=reorder(App, -avg_daily_minutes), y=avg_daily_minutes)) +
  geom_col(fill="#4daf4a", color="black") +
  labs(title="Average Daily Minutes Spent by Platform",
       x="Platform", y="Average Daily Minutes") +
  theme_minimal() +
  theme(plot.title = element_text(size=16, face="bold"),
        axis.title = element_text(size=14),
        axis.text = element_text(size=12),
        axis.text.x = element_text(angle = 45, hjust = 1))
```
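
A mean alone can hide how many users it is based on and how widely they vary. The following sketch, which uses made-up numbers and base R only, shows one way to report the user count and standard deviation next to each platform average. The `usage` data frame and its values are hypothetical.

```r
# Hypothetical usage data: two platforms with similar means but different spread
usage <- data.frame(
  App = c("Instagram", "Instagram", "Instagram", "Twitter", "Twitter"),
  Daily_Minutes_Spent = c(300, 360, 330, 100, 500)
)

# Per-platform user count, mean, and standard deviation
n_users <- tapply(usage$Daily_Minutes_Spent, usage$App, length)
mean_minutes <- tapply(usage$Daily_Minutes_Spent, usage$App, mean)
sd_minutes <- tapply(usage$Daily_Minutes_Spent, usage$App, sd)

summary_by_app <- data.frame(
  App = names(mean_minutes),
  n_users = as.vector(n_users),
  mean_minutes = as.vector(mean_minutes),
  sd_minutes = as.vector(sd_minutes)
)

# Sort from highest to lowest average, matching the bar chart's ordering
summary_by_app <- summary_by_app[order(-summary_by_app$mean_minutes), ]
print(summary_by_app)
```

In this toy case the two averages are close, but the second platform's average rests on two very different users, which the standard deviation makes visible.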

# Correlation Analysis

### Correlation Matrix of Engagement Metrics

```{r}
# Pairwise Pearson correlations between the numeric usage metrics
correlation_matrix <- data %>%
  select(Daily_Minutes_Spent, Posts_Per_Day, Likes_Per_Day, Follows_Per_Day, Engagement) %>%
  cor()

plot_ly(
  x = colnames(correlation_matrix),
  y = rownames(correlation_matrix),
  z = correlation_matrix,
  type = "heatmap",
  colorscale = "Viridis"
) %>%
  # title.font replaces the deprecated titlefont layout attribute
  layout(title = list(text = "Correlation Matrix for Social Media Usage Metrics",
                      font = list(size = 16)))
```
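
The matrix above pools every platform, while the second research question asks about likes and follows across platforms. A per-platform breakdown could be sketched as follows. The `likes_follows` data frame is hypothetical and was chosen so that the two platforms show opposite relationships.

```r
# Hypothetical data: likes and follows move together on one platform
# and in opposite directions on the other
likes_follows <- data.frame(
  App = rep(c("Instagram", "Twitter"), each = 4),
  Likes_Per_Day = c(10, 20, 30, 40, 10, 20, 30, 40),
  Follows_Per_Day = c(1, 2, 3, 4, 8, 6, 4, 2)
)

# Split the rows by platform and compute Pearson's r within each group
by_app <- split(likes_follows, likes_follows$App)
cor_by_app <- sapply(by_app, function(d) cor(d$Likes_Per_Day, d$Follows_Per_Day))

# Pooled correlation over all rows, for comparison
cor_pooled <- cor(likes_follows$Likes_Per_Day, likes_follows$Follows_Per_Day)

print(round(cor_by_app, 2))
print(round(cor_pooled, 2))
```

In this toy case the pooled coefficient is weak even though each platform shows a perfect linear relationship, so a pooled matrix should not be read as a per-platform result.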

# Exploration

## Row {.tabset}

### Engagement Across Platforms

```{r}
engagement_by_platform <- data %>%
  group_by(App) %>%
  summarize(total_engagement = sum(Engagement)) %>%
  arrange(desc(total_engagement))

ggplot(engagement_by_platform, aes(x=reorder(App, -total_engagement), y=total_engagement)) +
  geom_col(fill="#FF7F0E", color="black") +
  labs(title="Total Engagement Across Social Media Platforms",
       x="Platform", y="Total Engagement") +
  theme_minimal() +
  theme(plot.title = element_text(size=16, face="bold"),
        axis.title = element_text(size=14),
        axis.text = element_text(size=12),
        axis.text.x = element_text(angle = 45, hjust = 1))
```

### Time Spent vs. Follows Per Day

```{r}
ggplot(data, aes(x=Daily_Minutes_Spent, y=Follows_Per_Day, color=App)) +
  geom_point(size=3, alpha=0.6) +
  labs(title="Correlation Between Time Spent and Follows Per Day",
       x="Daily Minutes Spent", y="Follows Per Day") +
  theme_minimal() +
  theme(plot.title = element_text(size=16, face="bold"),
        axis.title = element_text(size=14),
        axis.text = element_text(size=12))
```

# Discussion

## Row {.tabset}

### Key Findings and Analysis

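The analysis addresses social media usage across platforms as follows:

- **Platform with Highest Engagement**: The Engagement Across Platforms chart ranks platforms by total engagement, that is, the sum of posts, likes, and follows per day over all users of a platform. The leftmost bar is the platform with the highest user activity. Because this is a total, it also grows with the number of users a platform has in the dataset.
- **Likes and Follows Relationship**: The correlation matrix reports the coefficient between `Likes_Per_Day` and `Follows_Per_Day`, pooled across platforms. It describes an association between the two metrics and does not by itself show that one drives the other.
- **Time Spent and Follows**: The scatter plot and the correlation matrix show how `Daily_Minutes_Spent` relates to `Follows_Per_Day`, with points colored by platform so that platform-specific patterns are visible.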
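
The quantities behind these findings can be computed directly rather than read off the charts. A minimal sketch in base R, using six sample rows in the dataset's column layout in place of the full `data` object (the name `sample_rows` is an assumption for illustration):

```r
# Six sample rows standing in for the full dataset
sample_rows <- data.frame(
  App = c("Pinterest", "Facebook", "Instagram", "TikTok", "LinkedIn", "Twitter"),
  Daily_Minutes_Spent = c(288, 192, 351, 21, 241, 464),
  Posts_Per_Day = c(16, 14, 13, 20, 16, 3),
  Likes_Per_Day = c(94, 117, 120, 117, 9, 137),
  Follows_Per_Day = c(0, 15, 48, 8, 21, 30)
)
sample_rows$Engagement <- with(sample_rows,
                               Posts_Per_Day + Likes_Per_Day + Follows_Per_Day)

# Finding 1: platform with the highest total engagement
total_by_app <- tapply(sample_rows$Engagement, sample_rows$App, sum)
top_platform <- names(which.max(total_by_app))

# Findings 2 and 3: Pearson correlations pooled across platforms
r_likes_follows <- cor(sample_rows$Likes_Per_Day, sample_rows$Follows_Per_Day)
r_minutes_follows <- cor(sample_rows$Daily_Minutes_Spent, sample_rows$Follows_Per_Day)

cat("Top platform by total engagement:", top_platform, "\n")
cat("r(likes, follows):", round(r_likes_follows, 2), "\n")
cat("r(minutes, follows):", round(r_minutes_follows, 2), "\n")
```

With only six rows the coefficients are not meaningful. On the full dataset the same calls apply to `data`, and `cor.test()` adds a confidence interval and p-value.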

### Limitations

While the analysis provides valuable insights, there are limitations to consider:

- **Dataset Scope**: The data may not capture all social media platforms or all types of user interactions.
- **Engagement Calculation**: Our engagement metric is a simple unweighted sum, so whichever metric has the largest counts dominates it, and it may not reflect nuanced interactions or platform-specific dynamics.
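
The effect of the simple sum, and one possible alternative, is sketched below. It assumes six sample rows in the dataset's column layout. The z-score index `engagement_z` is an illustration only, not the metric used in this dashboard.

```r
# Six sample rows of the three metrics that make up Engagement
metrics <- data.frame(
  Posts_Per_Day = c(16, 14, 13, 20, 16, 3),
  Likes_Per_Day = c(94, 117, 120, 117, 9, 137),
  Follows_Per_Day = c(0, 15, 48, 8, 21, 30)
)

# Raw sum, as used for the Engagement variable
engagement_sum <- rowSums(metrics)

# Share of the summed engagement that comes from likes alone
likes_share <- sum(metrics$Likes_Per_Day) / sum(engagement_sum)

# Alternative: standardize each metric (mean 0, sd 1) and average the z-scores,
# so that each metric contributes on the same scale
z_scores <- scale(metrics)
engagement_z <- rowMeans(z_scores)

cat("Share of engagement from likes:", round(likes_share, 2), "\n")
print(round(engagement_z, 2))
```

In these rows likes account for roughly three quarters of the summed engagement, so the raw sum mostly tracks likes, whereas the z-score version weights the three metrics equally.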

### Future Work

Future studies could include:

- **Time-Series Analysis**: Explore how engagement metrics change over time.
- **User Demographics**: Analyze engagement patterns by user demographics to gain targeted insights.

# References

- Kaggle. Social Media Usage Dataset. Available at: [https://www.kaggle.com/datasets/bhadramohit/social-media-usage-datasetapplications](https://www.kaggle.com/datasets/bhadramohit/social-media-usage-datasetapplications)
- Tidyverse Documentation. Comprehensive R Packages for Data Science.
- Plotly Documentation. Interactive Plots in R.